Bit-granular writes of control registers

ABSTRACT

In an example embodiment, a method writes individual bits of data to a register. Bits of data are received in a data field. The number of bits in the data field is equal to the number of bits in the register and the bit locations in the data field correspond respectively to the bit locations in the register. Enable bits are received in a bit enable field. The number of enable bits in the bit enable field is equal to the number of bits in the register. The bit locations in the bit enable field correspond respectively to bit locations in the register. Only the bits at the bit locations of the register for which the enable bit in the corresponding location in the bit enable field is set are overwritten with the bit in the corresponding location in the data field.

BACKGROUND

[0001] 1. Field of the Invention

[0002] This invention relates generally to data transfer operations between computer devices. In particular, the invention relates to methods of writing data to registers which control data transfer operations between devices in a computer system.

[0003] 2. Description of the Related Art

[0004] The host processor of a computer system fetches and executes instructions which may cause the host processor to transfer data between the memory, a central processing unit (CPU) and an arithmetic and logic unit (ALU) or to initiate input/output (I/O) data transfer operations with I/O devices or peripherals external to the host processor. The computer system typically includes at least one controller which acts as the communications intermediary between the processor and one or more I/O subsystems, which may each contain one or more external I/O devices or peripherals. As a result, transfer operations may not be optimized, and the wait time for processing data transferred through the computer system may be unnecessarily lengthened. The controller may be contained in a bridge, such as an I/O Controller Hub (ICH) available from Intel Corporation of Santa Clara, Calif., provided to interface with and buffer transfers of data between various computer devices.

[0005] The advanced technology attachment standard, frequently written as AT attachment (ATA) or integrated drive electronics (IDE), is commonly used for power and data signal interface communications between a host processor and a storage device. This set of standards is produced by Technical Committee T13 (www.t13.com) of the National Committee on Information Technology Standards (www.NCITS.org), Washington, D.C. The AT Attachment Interface for Disk Drives (ANSI X3.221-199x) is a disk drive interface standard that specifies the logical characteristics of the interconnecting signals as well as the protocols and commands for the storage device operation. This standard permits compatibility between host system products and storage device products that comply with the standard, even where these products are produced by different manufacturers.

[0006] An IDE controller is conventionally located between any IDE storage device (such as a hard disk drive) and the host processor. It serves as a translator to facilitate CPU/IDE device communications over each I/O cycle. For example, on receiving an initialization command from the host processor, the IDE interface controller converts the command into something the downstream IDE device will understand, i.e. that the IDE device can handle, and sends this command to the attached IDE device. On receiving the converted command, the IDE device processes the command and sends back a completion notification to the processor through the IDE interface controller. This conventional I/O cycle from command sent to completion notification is a single task-file register access that may take approximately 1.2 microseconds (μs; one microsecond is one millionth, or 10⁻⁶, of a second).

[0007] Conventionally, the host processor dedicates a block of its processing time to the initialization of a peripheral, such as an IDE storage device. During this peripheral initialization dedication time, the host processor is prevented from performing other processing functions and thus its performance is slowed down. Furthermore, the performance of a bridge or a controller may be burdened by demands to access memory locations and control registers during data transfer operations. Conventional control registers usually offer only byte-level write control. Therefore, when software must write to a specific bit (or bits) of a byte in a control register, it must first read the byte, merge the bit (or bits) to be modified into the read byte, and then write the modified byte back to the register. See FIG. 2. For typical I/O and configuration registers, the processor is stalled approximately 1 microsecond (about 1,000 processor clocks) while performing this read-merge-write sequence (perhaps only to write a single bit). If read-merge-write sequences are required within the initialization command sequence, then they inherently prevent the sequence from being streamlined, that is, from being posted to the controller for the external I/O device or peripheral.
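By way of illustration only, the conventional read-merge-write sequence may be sketched in software as follows. The class ByteRegister and the function read_merge_write are hypothetical names introduced for this sketch and do not correspond to any actual register interface; the sketch merely assumes a register that accepts whole-byte reads and writes and counts each access.

```python
class ByteRegister:
    """Toy model (an assumption) of a control register with byte-level access only."""

    def __init__(self, value=0):
        self.value = value & 0xFF
        self.reads = 0
        self.writes = 0

    def read(self):
        self.reads += 1
        return self.value

    def write(self, byte):
        self.writes += 1
        self.value = byte & 0xFF


def read_merge_write(reg, mask, new_bits):
    """Change only the bits selected by mask, at the cost of a read plus a write."""
    old = reg.read()                                    # 1. read the whole byte
    merged = (old & ~mask & 0xFF) | (new_bits & mask)   # 2. merge the modified bits
    reg.write(merged)                                   # 3. write the whole byte back
    return merged


reg = ByteRegister(0b01100110)
# Set bit 3 and clear bit 1, leaving the other six bits untouched.
read_merge_write(reg, mask=0b00001010, new_bits=0b00001000)
print(f"{reg.value:08b} reads={reg.reads} writes={reg.writes}")
```

The point of the sketch is the access count: even a two-bit change costs one register read in addition to the write, and it is the read that stalls the processor until data returns.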

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] A better understanding and appreciation of the foregoing and of the attendant advantages of the present invention will become apparent from the following detailed description of example embodiments of the invention. While the foregoing and following written and illustrated disclosure focuses on disclosing example embodiments of the invention, it should be clearly understood that the same is by way of illustration and example only and is not to be taken by way of limitation.

[0009] FIG. 1 is a generalized block diagram of an exemplary computer system in which an example embodiment of the invention may be practiced.

[0010] FIG. 2 illustrates a prior art method of changing individual bits in a control register.

[0011] FIG. 3 illustrates a method of writing individual bits to a control register according to an example embodiment of the invention.

[0012] FIG. 4 is a table illustrating an example in which individual bits in a control register are overwritten.

[0013] FIG. 5 illustrates a read/write command setup protocol method which may be used in conjunction with a method of writing individual bits to a control register according to an example embodiment of the invention.

[0014] FIG. 6 illustrates a conventional input/output (I/O) task-file access.

[0015] FIG. 7 illustrates a streamlining task-file access which may be used in conjunction with a method of writing individual bits to a control register according to an example embodiment of the invention.

DETAILED DESCRIPTION

[0016] While example embodiments are described herein, the present invention is applicable for use with all types of computer systems, I/O controllers and devices, and chipsets, including any follow-up chip designs which link together such disparate computer devices as processors, peripherals, storage devices, and devices for data communications. For the sake of simplicity, discussions will concentrate mainly on a desktop personal computer having several I/O units interconnected to a host processor by an I/O controller hub (ICH), buses and interfaces, although the scope of the present invention is not limited thereto. A wide variety of implementations, arrangements and configurations of computer systems (e.g., processors, bridges and I/O units) may be possible.

[0017] The system diagram of an exemplary desktop personal computer 100 is shown in FIG. 1. Although desktop computer system 100 is shown in FIG. 1, the invention may be utilized with a wide range of processing systems having I/O data transfer operations such as, but not limited to, a mainframe computer, a server, a radio, a television, a set-top box, a mobile computer, such as a laptop, a satellite system, or other electronic device that processes information.

[0018] The desktop computer system 100 includes a host processor subsystem 101 which may be comprised of one or more host processors (which may have respective associated cache memories) and a memory controller hub (MCH) 103 connected to the processor(s) by a host processor front side bus 102. The host processor(s) may be, for example, any one of the Pentium® family of processors manufactured by the assignee of this application, Intel Corp. of Santa Clara, Calif., but for the sake of simplicity the host processor(s) are represented and referred to merely as CPU 104. Regardless of the number of host processors in processor subsystem 101, a single processor may operate on a single item (such as an I/O data transfer operation), and the plurality of processors may collectively operate on multiple items (I/O data transfer operations) on a list at the same time.

[0019] Memory subsystem 106 is connected to MCH 103 through at least one memory bus 105 and stores information and instructions for use by processor subsystem 101. It has at least one memory element, which is preferably a dynamic random-access-memory (DRAM), but may be substituted for by other types of memory. Memory subsystem 106 may include any storage device that works toward holding data in a machine-readable format.

[0020] The desktop computer system 100 may have a motherboard 108 as a main board of the computer system. Motherboard 108 may contain circuitry for a processor, a keyboard, and a monitor as well as include slots to accept additional circuitry. It may also have one or more buses, which may each be one of a set of conductors (wires, printed circuit board tracks or connections in an integrated circuit) connecting the various functional units on motherboard 108.

[0021] A graphics subsystem having the necessary video memory and other electronics to provide a bitmap display to a display device (such as a cathode ray tube monitor, liquid crystal display, or flat panel display) is included with, or attached to, motherboard 108, or with or to other components included with or attached to motherboard 108. The graphics subsystem may be an Accelerated Graphics Port (AGP) video card 107 (including an AGP 4× graphics controller and a local memory on its own circuit board) connected to MCH 103 via an AGP 2.0 bus 109 as shown.

[0022] The operating system of desktop computer system 100 may include one or more device-specific drivers utilized to establish communication with I/O controllers, devices and peripherals, and perform functions common to most drivers, including, for example, initialization and configuration, resource management, send/receive I/O transaction messages, direct memory access (DMA) transactions (e.g., read and write operations), queue management, memory registration, descriptor management, message flow control, and transient error handling and recovery. Such software driver modules may be written using high-level programming languages such as C, C++ and Visual Basic, and may be provided on a tangible medium, such as a memory device, magnetic disk (fixed, floppy, and removable), other magnetic media such as magnetic tapes; optical media such as CD-ROM disks, or via Internet download, which may be available to conveniently plug-in or download into an existing installed operating system (OS). One or more such software driver modules may also be bundled with the existing operating system which may be activated by a particular I/O device driver.

[0023] An I/O controller hub (ICH) 110 is connected to MCH 103 by bus 111. It operates to bridge or interface with a plurality of various I/O devices and peripherals. Several different types of I/O devices and peripheral controllers may be attached to ICH 110, such as a Peripheral Component Interconnect (PCI) bus 115 with a plurality of slots 116. PCI bus 115 may be a high performance 32 or 64 bit synchronous bus with automatic configurability and multiplexed address, control and data lines as described in the latest version of “PCI Local Bus Specification, Revision 2.2” set forth by the PCI Special Interest Group (SIG) on Dec. 18, 1998 for add-on arrangements (e.g., expansion cards) with new video, networking, or disk memory storage capabilities. Other types of bus architecture such as Industry Standard Architecture (ISA) and Extended Industry Standard Architecture (EISA) buses may also be supported through a Moon PCI-ISA bridge 117.

[0024] A low pin count interface (LPC I/F) 120 of ICH 110 may support super I/O 121 for providing an interface with a plurality of I/O devices (not shown), including, for example, a keyboard controller for controlling operations of an alphanumeric keyboard, a cursor control device such as a mouse, track ball, touch pad, touch screen, joystick, digitizing tablet, a microphone, a mass storage device such as magnetic tapes, hard disk drives (HDD), and floppy disk drives (FDD), and serial and parallel ports to printers, scanners, and display devices. LPC I/F 120 may also support one or more firmware hubs 122, possibly over multiplexed connections, other application specific integrated circuit chips (ASICs) 123, and a management/security controller 124.

[0025] As shown in FIG. 1, ICH 110 may have a plurality of USB ports 125, which preferably collectively support both USB1 and USB2 protocols. ICH 110 may also support AC'97 Codec(s) 130 over an AC'97 2.1 bus, a local area network controller 135, GPIO 140, power management 145, including clock generators 146, system management (TCO) 150 and one or more SMBus device(s) over SMBus/I2C 155.

[0026] The exemplary, non-limiting ICH 110 shown in FIG. 1 supports both a primary IDE and a secondary IDE. The bus may be a 16-bit bus. One skilled in the art will recognize that the bus may have more throughput, such as a 32-bit Peripheral Component Interconnect (PCI) bus. The bus may be a first channel having an ATA ribbon cable, one end connected to a first IDE storage device, such as a master device, and the other connected to a second IDE device, such as a slave device. Each ribbon cable may be a 44/80 conductor cable or any suitable conductor cable. A secondary channel similar to bus 126 may be coupled to the controller so as to serve a second pair of master and slave devices.

[0027] Although not shown in FIG. 1, ICH 110 contains a plurality of controllers for the supported devices connected thereto. Exemplary supported I/O devices and peripherals include keyboards, input mouses, printers, scanners, display devices, hard disk drives, Compact Disk Read Only Memory (CD-ROM) drives, Compact Disk Read/Write (CD-RW) drives, and other types of storage devices. These controllers act as a communications translator between the supported devices and processor subsystem 101. They may include logic that runs protocol instructions out onto the bus connecting ICH 110 to the device. One of these controllers is an IDE interface controller or a controller compatible (including being backward compatible) with the IDE interface. One of these devices is a storage device that may require translation of processor instructions and may employ information stored in a location that may be connected with desktop computer system 100. It may be a disk drive that may be adapted to read and write at least one rigid magnetic data storage disk (hard disk) that rotates about a central axle. Despite the particulars of this example embodiment of the invention involving data transfers between an IDE interface controller in ICH 110 and one or more IDE storage devices, the invention is not limited thereto and may be applied to data transfers between any type of controller and device connected to the controller.

[0028] Desktop computer system 100 may be configured differently, or employ some additional or different components, than as shown in FIG. 1. Although an ICH can be implemented by a variety of different components, an exemplary ICH is the Intel® 82801BA I/O Controller Hub 2 (ICH2). Although ICH 110 includes example embodiments of the invention and thus differs from all known prior art components at least in that respect, it may be otherwise similar to a previously available ICH and a member of the family including one or more previously available ICHs, such as the Intel® 82801BA I/O Controller Hub 2 (ICH2). In addition, in any particular personal computer implementation, ICH 110 may integrate many of the legacy and new standard I/O interfaces for that personal computer either presently existing or hereafter developed.

[0029] The method of making bit-granular writes to control registers according to the example embodiment of the invention is preferably applied specifically to the IDE controller of ICH 110. Software (preferably, a driver in the operating system software) running in processor subsystem 101 accesses the IDE registers by running a transaction on the front side bus. The transaction is accepted and forwarded to ICH 110 by MCH 103. The transaction may be initially decoded by one block within ICH 110, and then forwarded to an IDE controller block within ICH 110. IDE accesses are forwarded to the IDE controller block (assuming the IDE controller block is enabled for accesses) where further decoding is performed to determine the exact register bits to be accessed.

[0030] Register reads always result in a “completion packet” back to the processor. The read completion packet contains the data from the register. These are passed back up through the Global Out block (GO unit) and the Hub Link block (L1 unit). Writes to registers, which are the primary transactions of interest for this invention, may not require any completion information passed back to the processor. Writes to “I/O” or “Configuration” space require completion packets, while writes to “Memory” space do not require completion packets. Note that the terms “I/O”, “Configuration” and “Memory” as used in these paragraphs only refer to a characteristic specified in the transaction that determines how the address should be decoded.

[0031] IDE can be programmed to initiate Direct Memory Accesses (DMA). DMA cycles are “memory” reads and writes that are sent up the hub link interface to the MCH, snooped in the processor's caches, and targeted to system DRAM. Once again, IDE-initiated reads to DRAM will result in subsequent read completion packets to be delivered to the ICH over the hub link interface.

[0032] As more fully developed below, the IDE controller in ICH 110 in the example embodiments of the invention may be adapted to handle initialization completion notification, unlike conventional I/O data transfer techniques which employ a CPU to handle initialization completion notification, so that the time the CPU dedicates to initializing an IDE device may be reduced. Specifically, the IDE controller allows software (including a driver in operating system software executed in processor subsystem 101) to write specific bits of a register while leaving other bits within the same byte unchanged. This is achieved by providing a “Bit Enable” field composed of a number of bits equal to the number of bits in the register. When writing to the register, the software specifies exactly which bits are to be overwritten by placing a “1” in the corresponding bit of the bit enable field.

[0033] FIG. 3 illustrates an example in which bit location 3 of a register must be set to “1” and bit location 1 must be cleared to “0”. In this example, the software provides an 8-bit value having a “1” in the bit locations of bit enable field 301 that correspond to bit locations 3 and 1 of the register and a “0” in the bit locations of bit enable field 301 that correspond to bit locations 2 and 0 of the register. This enables bit locations 3 and 1 of the register to be overwritten. It also provides a “1” in the bit location of data field 302 that corresponds to bit location 3 of the register and a “0” in the bit location of data field 302 that corresponds to bit location 1 of the register. (Any value may be provided at the bit locations of data field 302 that correspond to bit locations 2 and 0 of the register.) In the example shown in FIG. 3, the bit locations of bit enable field 301 and data field 302 are in the same position as the bit locations of the register.

[0034] Hardware associated with the register receives a data packet containing the bit enable field and the data field (“1010_1x0x” binary in the example of FIG. 3) and overwrites the bit locations of the register for which the enable bit in the corresponding location of the bit enable field is set. The other bit locations of the register are left unchanged. In the example shown in FIG. 3, bit locations 3 and 1 of the register are overwritten with the data in the corresponding bit locations of data field 302, while bit locations 2 and 0 of the register retain their initial values.
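The overwrite rule described above may be modeled, as a minimal software sketch only, by a single masked merge. The function name bit_granular_write and the 4-bit register width are assumptions made for this illustration; actual hardware would perform the equivalent selection in logic rather than in software.

```python
def bit_granular_write(current, enable, data, width=4):
    """Return the new register value after a bit-granular write.

    Bits whose enable bit is 1 take their value from data; all other
    bits keep the value they had in current.
    """
    mask = (1 << width) - 1
    enable &= mask
    return (current & ~enable & mask) | (data & enable)


# Example in the manner of FIG. 3: enable = 1010, data = 1x0x.
initial = 0b0110
with_zeros = bit_granular_write(initial, enable=0b1010, data=0b1000)  # x treated as 0
with_ones = bit_granular_write(initial, enable=0b1010, data=0b1101)   # x treated as 1
print(f"{with_zeros:04b} {with_ones:04b}")
```

Both calls yield the same result, which reflects that the values supplied at bit locations 2 and 0 of the data field are ignored when the corresponding enable bits are clear.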

[0035] Generally speaking (without reference to the example shown in FIG. 3), the bit enable field allows any combination of N register bits to be over-written with a 2*N-bit write command with relatively simple hardware implementation. The invention can be easily applied to existing hardware structures having existing control registers by providing an alternate register location for implementing the “bit-granular writes” as described above. In the case of an I/O Controller Hub (ICH) or other suitable hardware device, the alternate register location may be placed in memory space, thereby allowing the processor to post the bit-granular writes for use with the streamlined technique described below.
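As one hedged illustration of the 2*N-bit write command, the sketch below packs an N-bit bit enable field and an N-bit data field into a single value and then applies that value to an N-bit register. Placing the enable field in the upper N bits follows the “1010_1x0x” packet layout of the FIG. 3 example; the function names pack_command and apply_command are hypothetical.

```python
def pack_command(enable, data, n):
    """Build a 2*N-bit write command: enable field in the upper N bits, data below."""
    mask = (1 << n) - 1
    return ((enable & mask) << n) | (data & mask)


def apply_command(current, command, n):
    """Model of the register-side logic: split the command and merge enabled bits."""
    mask = (1 << n) - 1
    enable = (command >> n) & mask
    data = command & mask
    return (current & ~enable & mask) | (data & enable)


command = pack_command(enable=0b1010, data=0b1000, n=4)
result = apply_command(0b0110, command, n=4)
print(f"command={command:08b} result={result:04b}")
```

Because the command carries everything needed to update the register, no prior read is required, and the write can be posted when the alternate register location resides in memory space.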

[0036] In the example embodiments of the invention described herein, the methods are utilized with either one of, or both of, the IDE DMA status register and the IDE command register. In particular, the methods are used when initiating a DMA sequence for an IDE storage device.

[0037] To cause an IDE storage device or other external I/O device or peripheral to perform I/O data transfer operations, the host processor initializes it so as to prepare it to receive I/O data transfer operation commands. To initialize the IDE storage device or other external I/O device or peripheral, the host processor transmits one or more task-file initialization commands to it as a data packet through what is called the task-file register set. Each task-file initialization takes a significant length of time to execute. One reason for this is that each command execution is verified by the host processor before the next command may be executed.

[0038] Transmitting information to an IDE device may involve several individual actions (or “writes”) to the task-file register set, each of which conventionally may be processed in a 1.2 μs I/O cycle. For example, where seven individual initialization actions are processed by an IDE interface controller, the total I/O cycle time may be 8.4 μs (=7×1.2). The collective of these seven writes may be thought of as a task-file. The action of the CPU in performing a series of commands or I/O accesses to properly enable an IDE device for the transfer of data may be referred to as “writing the task file.”

[0039] During the time in which the IDE interface controller device sends a task-file I/O, the processor is blocked from generating further commands or receiving further requests. Under current ATA standards, the processor would be tied up for 8.4 μs on sending a command to an IDE device with seven individual I/O task-file writes. To put this wait into perspective, a 1 GHz processor may execute about 1,000 ordinary instructions in one microsecond. Thus, in the 8.4 μs a processor dedicates to IDE device initialization, up to 8,400 ordinary instructions (=1,000×8.4) could be processed by the processor if the time a processor dedicates to IDE device command setup were reduced.
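The arithmetic in the two preceding paragraphs may be reproduced with the short sketch below. The function names are hypothetical, and the estimate assumes, as the text does, roughly one ordinary instruction per clock on a 1 GHz processor.

```python
def task_file_stall_us(writes, cycle_us=1.2):
    """Total stall, in microseconds, for a number of conventional task-file writes."""
    return writes * cycle_us


def instructions_forgone(stall_us, clock_ghz=1.0):
    """Rough count of ordinary instructions that fit in the stall (1 per clock assumed)."""
    return round(stall_us * clock_ghz * 1000)


stall = task_file_stall_us(7)
forgone = instructions_forgone(stall)
print(f"{stall:.1f} us stall, about {forgone} instructions")
```

The same two functions can be used to explore other write counts or cycle times; the figures of seven writes and 1.2 μs per write are the ones given in the description.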

[0040] By employing a shadow register space to handle task-file I/O completion, the present invention works toward reducing the time a processor may dedicate to command setup of an IDE device by approximately 7.0 μs, from 8.4 μs to approximately 1.4 μs. Moreover, the invention similarly may be employed towards reducing the time a processor may dedicate to the operation of devices internal and external to a processor. Therefore, in a “fill condition,” i.e. when a task-file register is written and a processor must wait until the I/O cycle out to the IDE device is completed, the register space allows the extension in the IDE interface controller that is described in this invention to complete the I/O cycle to the IDE device, thus allowing the processor to return to processing other tasks, such as task-file writes.

[0041] An addressing method is used to uniquely identify the source and destination of a data transfer in desktop computer system 100 in a meaningful manner. Each device, such as a memory integrated circuit, storage device, or processor, may have its own local address space. An address space may be the range of addresses that a processor or process can access, or at which a device can be accessed.

[0042] The address space of a device bus may be dependent upon at least the width of the address; that is, the number of bits in the address. A device bus having an address width of 16 bits uniquely identifies 2¹⁶ or exactly 65,536 locations. The size of a processor's address space depends on the width of the processor's address bus and address registers. Each local address space may start at zero. Each local address may be mapped to a range of addresses which starts at some base address in the processor's address space. Similarly, each process will have its own address space, which may be all or a part of the processor's address space.
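A brief sketch of this addressing relationship follows. The base address 0x10000 and the local offset 0x1F0 are arbitrary values chosen for illustration, and the function names are hypothetical; the sketch only shows that a local address space of a given width starts at zero and is relocated by adding a base address.

```python
def address_space_size(width_bits):
    """Number of uniquely identifiable locations for a given address width."""
    return 1 << width_bits


def map_local_address(base, local, width_bits):
    """Translate a device-local address into the processor's address space."""
    if not 0 <= local < address_space_size(width_bits):
        raise ValueError("local address outside the device's address space")
    return base + local


size = address_space_size(16)
mapped = map_local_address(base=0x10000, local=0x1F0, width_bits=16)
print(size, hex(mapped))
```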

[0043] Preferably, the initialization can be streamlined so that processor subsystem 101 can post an entire command sequence for setting up an I/O data transfer operation with an external I/O device or peripheral to the control registers in a controller for the external I/O device or peripheral. The ICH 110 preferably implements a streamlining command setup feature that allows an IDE driver in the operating system executed by processor subsystem 101 to perform task file commands for a typical disk access using posted memory writes, instead of I/O writes. This allows CPU 104 in processor subsystem 101 to quickly complete the IDE set up and move on to other operations.

[0044] In a conventional I/O-based method, the CPU spends an average of greater than 1.2 microseconds per access that runs to the IDE drive as shown in FIG. 2. Although the IDE I/O accesses are used to initiate all disk accesses regardless of IDE mode (i.e. PIO, DMA, UDMA), this streamlining feature must be used only for UDMA transfers. This allows the driver software to off-load CPU 104 in processor subsystem 101 earlier to perform other activities while waiting for the interrupt upon the completion of the transfer.

[0045] With six PIO command accesses and two read-modify-write accesses to the Bus Master I/O registers, CPU 104 is stalled for approximately 8 microseconds at the beginning of any disk access while performing this sequence. When the current I/O cycles are replaced by the 8 posted memory writes described below (on a 133 MHz front-side bus), the expected processor stall is 7.91 microseconds for a conventional I/O process versus only 0.18 microseconds for the streamlining process. The result is a 98% decrease in time duration, for an actual savings of 7.73 microseconds. The improvement is less impressive if the Fast NonData PIO mode can be used with the drives on the IDE interface.
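The comparison above may be checked with the following sketch, which takes the two stall times stated in the description as inputs. The function name stall_savings is hypothetical and is introduced only for this illustration.

```python
def stall_savings(conventional_us, streamlined_us):
    """Return (microseconds saved, percentage decrease) between two stall times."""
    saved = conventional_us - streamlined_us
    percent = 100.0 * saved / conventional_us
    return saved, percent


saved_us, percent = stall_savings(conventional_us=7.91, streamlined_us=0.18)
print(f"{saved_us:.2f} us saved, {percent:.1f}% decrease")
```

The computed decrease is slightly under 98%, which is consistent with the rounded figure given above.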

[0046] FIG. 5 illustrates a read/write streamlined command setup method 500. FIG. 6 illustrates a timeline of the method of FIG. 5. As is readily seen in FIG. 6, while the ATA channel is in an active state, such as writing drive select, the CPU 104 is blocked from performing other tasks.

[0047] Method 500 may include similarities to conventional protocols used to write a task-file. However, it differs at least insofar as it involves writing to a memory-mapped register queue rather than I/O mapped task-file registers.

[0048] Method 500 may be implemented in software recorded in any readable medium which, when executed, causes computer 100, preferably processor subsystem 101, to perform method 500. In one embodiment, method 500 may be implemented through a distributed readable storage medium containing executable computer program instructions which, when executed, cause at least one of a client computer system and a server computer system to perform method 500. Additionally, method 500 may be implemented through a computer readable storage medium containing executable computer program instructions which, when executed, cause a computer system 100 to perform method 500.

[0049] Method 500 may begin at step 502. At step 502, method 500 may address the storage device so as to command the attention of the storage device. Bit four of the device/head register field indicates the selected device (DEV). Thus, step 502 may include placing the proper input at bit four (the write/drive select bit) of the device/head register field. This bit four information is always sent to the I/O task-file.

[0050] At step 504, method 500 may read an alt-status register of the storage device to determine whether the storage device is busy. If it is busy, then method 500 may return a “SRB_STATUS_BUSY” signal at step 506 since the small computer system interface (SCSI) request block (SRB) field would not be clear. From step 506, method 500 may return to step 504.

[0051] Under normal operations, the storage device may not be busy at the first reading of its alt-status register. Thus, the read command of step 504 may be sent to the I/O task-file. Alternatively, method 500 may return to step 504 up to 20,000 times. Here, the read command of step 504 may be sent to the memory queue.
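The polling behavior of steps 504 and 506 may be sketched as a bounded loop. In the sketch below, the busy flag is assumed to be bit 7 of the alt-status value, the status values returned by the simulated drive are arbitrary, and the names wait_until_not_busy and make_fake_drive are hypothetical; a real driver would read the device's register rather than a software stand-in.

```python
BSY = 0x80  # assumption: busy flag occupies bit 7 of the alt-status value


def wait_until_not_busy(read_alt_status, max_polls=20000):
    """Poll until the busy flag clears; return the number of reads, or None on give-up."""
    for attempt in range(1, max_polls + 1):
        if not read_alt_status() & BSY:
            return attempt
    return None


def make_fake_drive(busy_reads):
    """Stand-in for a storage device that reports busy for the first busy_reads polls."""
    remaining = [busy_reads]

    def read_alt_status():
        if remaining[0] > 0:
            remaining[0] -= 1
            return 0xD0  # arbitrary status value with the busy bit set
        return 0x50      # arbitrary status value with the busy bit clear

    return read_alt_status


polls = wait_until_not_busy(make_fake_drive(busy_reads=2))
print(polls)
```

Bounding the loop at 20,000 iterations mirrors the retry limit mentioned above and keeps a device that never clears its busy flag from stalling the caller indefinitely.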

[0052] If the storage device is not busy, then method 500 may continue to step 508. At step 508, method 500 may determine whether the bus master (BM) engine of the storage device is active. If the BM engine of the storage device is active, then the BM engine may be turned off and the drive reset at step 510. If the BM engine of the storage device is not active, then method 500 may proceed to step 512.

[0053] At step 512, method 500 may calculate the block count and program the device. This may involve writing the block length to the memory queue of the sector count register field. Step 512 is distinguished from conventional techniques in that, under conventional techniques, the block length is written to an I/O task-file whereas step 512 includes writing the block length to the memory queue.

[0054] At step 514, method 500 may calculate the logical block address (LBA) and program the device. This may include writing at least one of the following registers to the memory queue: sector number, cylinder low, cylinder high, and device/head register. Step 514 is distinguished from conventional techniques in that, under conventional techniques, these registers are written to an I/O task-file whereas step 514 includes writing the registers to the memory queue.

[0055] Preferably, ICH 110 implements a Command Posting FIFO for each IDE channel. PIO write commands are posted to the FIFO by writing to the memory location specified in the IDE Command Posting Range, below. The commands are then executed in order on the respective interface. The depths of the FIFOs may be provided to software through the Primary and Secondary Posting FIFO Depth Registers. Software must use this information to guarantee that the FIFOs do not overflow. The depth of each of the FIFOs may be fixed to a maximum of 8 entries. One possible sequence of writes is:

[0056] 1. Sector Count Register (02h)

[0057] 2. Sector Number (03h)

[0058] 3. Cylinder Low (04h)

[0059] 4. Cylinder High (05h)

[0060] 5. Device/Head (06h)

[0061] 6. Command (07h)

[0062] 7. Bus Master Status Register Interrupt Cleared

[0063] 8. Bus Master Start/Stop bit is set, and the Read/Write bit is written simultaneously.

[0064] These are preferably the last 8 writes in the command sequence. As described above, software must select the ATA device prior to this specific sequence.
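The posting of the eight writes listed above may be modeled, purely as an illustrative sketch, by a bounded first-in first-out queue. The class CommandPostingFifo, the labels used for the two bus master writes, and every data value in the sequence are assumptions made for this example; only the task-file offsets 02h through 07h and the depth of 8 are taken from the description.

```python
from collections import deque


class CommandPostingFifo:
    """Toy model (an assumption) of a per-channel command posting FIFO."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = deque()

    def free_slots(self):
        return self.depth - len(self.entries)

    def post(self, target, value):
        """Queue one write; software is responsible for never overflowing the FIFO."""
        if self.free_slots() == 0:
            raise OverflowError("command posting FIFO is full")
        self.entries.append((target, value))

    def drain(self):
        """Execute the posted writes in order and return them as a list."""
        executed = []
        while self.entries:
            executed.append(self.entries.popleft())
        return executed


# Illustrative sequence; all data values are placeholders.
sequence = [
    (0x02, 0x01),         # 1. sector count
    (0x03, 0x10),         # 2. sector number
    (0x04, 0x00),         # 3. cylinder low
    (0x05, 0x00),         # 4. cylinder high
    (0x06, 0xE0),         # 5. device/head
    (0x07, 0xC8),         # 6. command
    ("BM_STATUS", 0x04),  # 7. clear bus master interrupt status
    ("BM_COMMAND", 0x09), # 8. set start/stop together with read/write
]

fifo = CommandPostingFifo(depth=8)
for target, value in sequence:
    fifo.post(target, value)
executed = fifo.drain()
print(len(executed), fifo.free_slots())
```

In this model the processor's work ends once the eighth write is posted; draining the queue in order corresponds to the controller executing the commands on the interface without further processor involvement.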

[0065] At step 516, method 500 may include programming the DMA descriptor table contents. At step 518, the command register may be programmed with a read or write command that may be sent to the memory queue rather than an I/O task-file.

[0066] At step 520, the BM engine may be programmed. This may involve clearing the BM interrupt (BMI) status bit for a specifically-accessed controller, such as a controller in ICH 110. Additionally, the drive transfer protocol (DTP) of the BM engine may be set. In one embodiment, the BMI_DTP register may be set only once. Last, the “Start/Stop Bus Master” bit of the BMI control may be set.

[0067] With the BM engine programmed at step 520, method 500 may wait for an interrupt signal from the storage device at step 522. This may involve returning a “SRB_STATUS_PENDING” signal since the small computer system interface (SCSI) request block (SRB) field would be connected with change. At step 524, an interrupt signal may be received.

[0068] The invention may be employed whenever a direct memory access read or write is initiated to an ATA device. Since typical computer systems include a primary hard disk drive that may be enabled for accesses through a direct memory access read or write, CPU performance may improve with each use of this invention. In turn, as CPU performance improves, the disk access overhead that this invention works towards reducing will equate to an ever-larger performance gain.

[0069] FIG. 7 illustrates a timeline for a method of the invention wherein task-file access is streamlined. By using this example embodiment of the invention, the CPU is freed up to allow processing of other tasks.

[0070] The above embodiment can also be stored on a device or medium and read by a machine to perform instructions. The device or medium may include a solid state memory device and/or a rotating magnetic or optical disk. The device or medium may be distributed when partitions of instructions have been separated into different machines, such as across an interconnection of computers.

[0071] While the foregoing and following written and illustrated disclosure focuses on describing example embodiments of the invention, it should be understood that the same is by way of illustration and example only, is not to be taken by way of limitation, should not be construed as limiting the scope of the subject matter of the claimed invention, and may be modified in learned practice of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. Moreover, the principles of the invention may be applied to achieve the advantages described herein and to achieve other advantages or to satisfy other objectives, as well. While the foregoing has described what are considered to be example embodiments of the invention, it is understood that various modifications may be made therein and that the invention may be implemented in various forms and embodiments, and that it may be applied in numerous applications, only some of which have been described herein. It is intended by the following claims to claim all such modifications and variations.

1. A method of writing individual bits of data to a register, said method comprising: receiving bits of data in a data field, the number of bits in the data field being equal to the number of bits in the register and bit locations in the data field corresponding respectively to bit locations in the register; receiving enable bits in a bit enable field, the number of enable bits in the bit enable field being equal to the number of bits in the register and bit locations in the bit enable field corresponding respectively to bit locations in the register; and overwriting only the bits at the bit locations of the register for which the enable bit in the corresponding location in the bit enable field is set with the bit in the corresponding location in the data field.
 2. The method recited in claim 1, wherein the register is a control register for a data transfer operation.
 3. The method recited in claim 2, wherein the data transfer operation transfers data to or from an IDE storage device.
 4. The method recited in claim 3, wherein the control register is an IDE DMA status register.
 5. The method recited in claim 3, wherein the control register is a command register.
 6. The method recited in claim 1, wherein some of the bits of said register are not overwritten.
 7. The method recited in claim 1, wherein the data field and the bit enable field are received simultaneously.
 8. The method recited in claim 7, wherein the data field is provided at an address which is contiguous with the address for the bit enable field.
 9. The method recited in claim 1, wherein the data transfer operation comprises an IDE data transfer between a processor subsystem and an external IDE storage device or peripheral.
 10. The method recited in claim 9, wherein the processor subsystem posts an entire command sequence for setting up the IDE data transfer.
 11. The method recited in claim 9, wherein the method is carried out in an IDE controller in a bridge connected between the processor subsystem and the external IDE storage device or peripheral.
 12. A computer comprising: a processor subsystem; a device which transfers data to or from said processor subsystem; and a controller connected between said device and said processor subsystem and adapted to control the transfer of data between said device and said processor subsystem, said controller executing a method comprising, receiving bits of data in a data field, the number of bits in the data field being equal to the number of bits in a control register in the controller and bit locations in the data field corresponding respectively to bit locations in the control register; receiving enable bits in a bit enable field, the number of enable bits in the bit enable field being equal to the number of bits in the control register and bit locations in the bit enable field corresponding respectively to bit locations in the control register; and overwriting only the bits at the bit locations of the control register for which the enable bit in the corresponding location in the bit enable field is set with the bit in the corresponding location in the data field.
 13. The computer recited in claim 12, further comprising a bridge between the processor subsystem and at least said device, the controller being included in the bridge.
 14. The computer recited in claim 13, wherein the device comprises an IDE storage device and the bridge comprises an I/O controller hub (ICH) which controls an IDE data transfer between the processor subsystem and the IDE storage device.
 15. The computer recited in claim 12, wherein the processor subsystem posts an entire command sequence in the controller for setting up the IDE data transfer.
 16. A software program stored in a tangible medium, said program, when executed, causing a computer to execute a method of writing individual bits of data to a register, said method comprising: receiving bits of data in a data field, the number of bits in the data field being equal to the number of bits in the register and bit locations in the data field corresponding respectively to bit locations in the register; receiving enable bits in a bit enable field, the number of enable bits in the bit enable field being equal to the number of bits in the register and bit locations in the bit enable field corresponding respectively to bit locations in the register; and overwriting only the bits at the bit locations of the register for which the enable bit in the corresponding locations in the bit enable field is set with the bits in the corresponding location in the data field.
 17. The software program recited in claim 16, wherein said software program comprises a driver in the operating system software executed by a processor subsystem in the computer.
 18. The software program recited in claim 17, wherein the register is a control register in a controller adapted to control an IDE data transfer operation between said processor subsystem and an IDE storage device.
 19. The software program recited in claim 17, wherein the processor subsystem posts an entire command sequence for setting up the IDE data transfer to the controller.